School Reform in Philadelphia:

A Comparison of Student Achievement at Privately-Managed Schools
with Student Achievement in Other District Schools 



Paul E. Peterson 
(Executive Summary) 

No Child Left Behind (NCLB) asks that states “restructure” schools that fail for 
six years running to make Adequate Yearly Progress (AYP) toward full proficiency on 
the part of all students by the year 2014. According to the legislation, restructuring may 
entail the transformation of the school into a charter school, the shift of management to a 
private entity (either for-profit or non-profit), or a number of other options. Although 
restructuring efforts by most states have been modest, Pennsylvania, in the summer of 
2002, directed the School District of Philadelphia to undertake substantial restructuring of 
its 66 lowest performing schools under the overall direction of the Philadelphia School 
Reform Commission (SRC). The schools were contracted out to for-profit management 
organizations, to non-profit organizations, or assigned to be restructured by a newly 
created Office of Restructured Schools (ORS), a special office within the school district 
itself. Since the schools assigned to the private managers were the lowest
performing in the district, the managers were asked, especially, to reduce the percentage of
students performing below the basic level.

The policy intervention in Philadelphia raises questions of general interest: Can 
private managers lift the percentage of students performing up to a basic level of 
proficiency as well or better than schools managed by a school district? Can they 
increase the percentage of students performing at state-defined proficiency levels? Does 
the competition stimulated by contracting out some schools to private management raise 
performance district-wide? Are the benefits of the reform effort worth the costs?

The Philadelphia reforms do not provide an ideal test for answering these 
questions. Too many restrictions were placed on the private managers, who were not 
given a representative sample of schools to oversee. Instead, they were assigned the 
lowest-performing schools in the district. But if definitive answers cannot be given, 
some information can be gleaned from student test-score performance available for the 
period covering the four years since the reforms were instituted. 

Using publicly available evidence concerning student test-score performance 
between 2002 and 2006, I tracked the performance of two cohorts of 5th graders for three
years to see whether, by 8th grade, those attending elementary and middle schools
under private management learned more than students at 8 ORS schools and more than
students in the district as a whole, as indicated by their performance on the Pennsylvania 
State System of Assessment (PSSA), a high-stakes test used for accountability purposes 
under NCLB. 




All schools for which I was able to follow a cohort of 5th graders through 8th grade
were included in the analysis. The number of privately-managed and ORS schools 
tracked in the first cohort was 16 and 4, respectively. Those numbers increased to 19 and 
8 for the second cohort. 

Pennsylvania reports the changes in the percentage of students performing on the 
PSSA at four different levels: 1) below basic, 2) basic proficiency, 3) full proficiency and 
4) advanced. 

I found that the private providers were especially effective at increasing the 
percentage of students performing at or above the basic level. In reading, for example, 
the improvement for the second cohort of students (2003-06) was 25 percentage points at the
privately-managed schools, as compared to only 15 points at
ORS schools and 17 points for the district as a whole. In math, the change
was 23 percentage points at the privately-managed schools, as compared to
12 points at the ORS schools and 15 points district-wide. Similarly disproportionate
gains were generally observed at the privately-managed schools for the first cohort of 
students (2002-05). 

Privately-managed schools were as effective as other schools in the
district at bringing 5th graders up to fully proficient levels of performance by 8th grade,
despite the fact that their students were initially performing at significantly lower
levels. As compared to ORS schools, they made larger gains in math and similar gains in
reading. In interpreting these results, it must be kept in mind that students at the for-profit 
schools were especially disadvantaged, and they had to improve by a larger margin in 
order to reach proficiency levels. 

Before 2006, only 5th and 8th grade test scores on the PSSA were used for
accountability purposes under NCLB. As a result, comparable information is not 
available for students at other grade levels. 

My results differ from those reported by a team of scholars associated with the 
RAND Corporation and Research for Action (Gill, Zimmer, Christman, and Blanc, 2006, 
hereinafter referred to as RAND-RFA), a policy-oriented research group in Philadelphia. 
They found students at privately-managed schools learning no more than other students 
in the school district. However, their analysis, though presented as a quasi-experiment, 
estimates levels of achievement, not gains in achievement. RAND-RFA says that it must
estimate levels because, for as many as half the students included in its analysis, only two test
scores (rather than the three necessary to conduct a quasi-experiment) are available. Also,
an unknown number of students are observed only post-treatment. As a result, the study
fails to adjust appropriately for student background characteristics (e.g., race, ethnicity,
family background) and peer-group composition. Moreover, it does not focus
exclusively on state-mandated PSSA performance, but on a mix of high-stakes and
low-stakes tests that can be equated only by making strong assumptions about test
equivalency. As a result, it is at risk of under-estimating the impact on student performance
of the private management organizations, which were assigned the most
disadvantaged schools in the Philadelphia school district.

Reconciliation of the differences between the two studies is best done by allowing 
other qualified scholars to analyze the data now available only to the RAND Corporation, 
a common practice with data collected by government agencies. In the absence of that 
information, an analysis of trends in cohort performance provides the best estimate of the
effectiveness of privately-managed schools that can be made from publicly available 
data. 



Whether or not competition stimulated a rise in district-wide performance could 
not be determined. Philadelphia test scores have risen substantially, but the data are not 
available to ascertain whether those gains exceed the ones achieved in other comparable 
school districts. RAND-RFA’s attempt to address this question compares schools that 
have strikingly different student populations. 

It was also not possible to conduct a cost-benefit analysis, because the 
Philadelphia school district does not make available information on per-pupil 
expenditures by school. Statements about cost-effectiveness in the RAND-RFA study 
lack the supporting evidence on expenditures per pupil, and I was not able to obtain that 
information either. 




School Reform in Philadelphia: 

A Comparison of Student Achievement at Privately-Managed Schools
with Student Achievement in Other District Schools 

Paul E. Peterson 

No Child Left Behind (NCLB) asks that states “restructure” schools that fail for 
six years running to make Adequate Yearly Progress (AYP) toward full proficiency on 
the part of all students by the year 2014. 1 According to the legislation, restructuring may 
entail the transformation of the school into a charter school, the shift of management to a 
private entity (either for-profit or non-profit), or a number of other options. Although 
many school districts have made only modest changes as part of their restructuring effort 
(Mead, 2007), Pennsylvania, in the summer of 2002, directed the School District of 
Philadelphia to undertake a substantial restructuring of its 66 lowest performing schools 
under the overall direction of the Philadelphia School Reform Commission (SRC). 

Thirty elementary and middle schools were contracted out to for-profit management
organizations. 2 Fifteen schools were contracted out to non-profit organizations, and 21
schools were assigned to be restructured by a newly created Office of Restructured 
Schools (ORS), a special office within the school district itself. 

1 Support for my research was provided by the Lynde and Harry Bradley Foundation, the
John M. Olin Foundation and Edison Schools. Four peer reviewers critiqued earlier drafts
of this research paper. Staff members at sponsoring institutions were invited to review the
report for fact-checking purposes. Research results are the sole responsibility of the
author. Mark Linnen and Daniel Nadler provided research assistance.

2 A high school was contracted out to a for-profit provider in 2004-05.

The management of the restructured schools was to take place within the terms of
collective bargaining agreements with employee unions. Other district policies were to
remain in effect as well. The managers’ tasks were greatly complicated by the SRC
decision to allow any teacher at the schools to transfer to another school in the district, if 
they so desired. Finally, the for-profit entities were asked to manage the schools that had 
the very lowest-performing students, and the students attending schools assigned to the 
non-profits had only marginally higher test score performances than those assigned to the 
for-profits. 

The policy intervention in Philadelphia raises questions of general interest: Can 
private managers reduce the percentage of students performing below a basic level as 
well as or better than district-managed schools? Can they increase the percentage of 
students performing at state-defined proficiency levels? Does the competition stimulated 
by contracting out some schools to private management raise performance district-wide?
Are the benefits of the reform effort worth the costs? 

The Philadelphia reforms do not provide an ideal test for answering these 
questions. Too many restrictions were placed on both types of managers, and neither
was assigned a representative sample of schools to manage. But if definitive answers 
cannot be given, some information can be gleaned from student test-score performances 
over the four years since the reforms were instituted. 

Using publicly available evidence concerning student test-score performance 
between 2002 and 2006, I tracked the performance of two cohorts of 5th graders at
elementary and middle schools for three years to see whether, by 8th grade, those attending
schools under private management learned more than students at 8 ORS schools and
more than students at other schools in the district, as indicated by their performance on the
Pennsylvania State System of Assessment (PSSA), a high-stakes test used for
accountability purposes under NCLB. 3

Pennsylvania reports the changes in the percentage of students performing on the 
PSSA at four different levels: 1) below basic, 2) basic proficiency, 3) full proficiency and 
4) advanced. 

I found that schools managed privately were more effective at lifting the 
percentage of students performing at the basic level or above than were either the ORS 
schools or schools in the district as a whole. In reading, for example, the improvement
for the second cohort of students (2003-06) was 25 percentage points at the schools managed
privately, as compared to only 15 points at ORS schools and 17 points for the district
as a whole. In math, the percentage of students performing at or above the basic level
jumped by 23 percentage points at the privately-managed schools, as compared to 12
points at the ORS schools and 15 points district-wide. Similarly disproportionate
improvements were generally observed at the privately-managed schools for the first 
cohort of students (2002-05). See Table 3. 

Privately-managed schools were as effective as other schools in the
district at bringing 5th graders up to fully proficient levels of performance by 8th grade,
despite the fact that their students were initially performing at significantly lower
levels. As compared to ORS schools, they made larger gains in math and similar gains in
reading. (See Tables 1 and 3.)



3 The number of schools participating in the restructuring process changed from year to
year. All schools for which I was able to follow a cohort of 5th graders up through 8th
grade are included in the analysis. That number differed for the two cohorts.

Because a high-stakes version of the PSSA was given only in 5th and 8th grades
during the four-year period under investigation, comparable information is not available
for students at other grade levels. 

My results differ from those reported by a team of scholars associated with the 
RAND Corporation and Research for Action (Gill, Zimmer, Christman, and Blanc, 2006, 
hereinafter referred to as RAND-RFA), a policy-oriented research group in Philadelphia. 
They found students at privately-managed schools learning no more than other students 
in the school district. However, their analysis, though presented as a quasi-experiment, 
contains many students for whom there are only two test scores (rather than the three that 
are necessary), many of them observed only post-treatment. As a result, the study fails to 
adjust appropriately for student background characteristics (e.g., race, ethnicity, family
background) and peer-group composition. Moreover, it does not focus exclusively on
state-mandated PSSA performance, but on a mix of high-stakes and low-stakes tests that
can be equated only by making strong assumptions about test equivalency. As a result, it
is at risk of under-estimating the impact of private management on student performance,
because the private managers were assigned the most disadvantaged schools in the Philadelphia school
district.

Reconciliation of the differences between the two studies is best done by allowing 
other qualified scholars to analyze the data now available only to the RAND Corporation, 
a common practice with data collected by government agencies. In the absence of that 
information, an analysis of trends in cohort performance provides the best estimate of the
effectiveness of privately-managed schools that can be made from publicly available
data.

Whether or not competition stimulated a rise in district-wide performance could
not be determined. Philadelphia test scores have risen substantially, but the data are not 
available to ascertain whether those gains exceed the ones achieved in comparable school 
districts. RAND-RFA’s attempt to address this question compares schools that have 
strikingly different student populations. 

It was also not possible to conduct a cost-benefit analysis, because the 
Philadelphia school district does not make available information on per-pupil 
expenditures by school. Statements about cost-effectiveness in the RAND-RFA study 
lack the supporting evidence on expenditures per pupil, and I was not able to obtain that 
information either. 

Pennsylvania State System of Assessment 

The PSSA is the primary vehicle for holding schools accountable for improving 
student learning in Philadelphia. On that matter, there is clear agreement among 
Philadelphia’s SRC, the State of Pennsylvania, and NCLB legislation. For example, SRC
chairman James Nevels hailed the rise in PSSA test scores in Philadelphia one year after
the reforms had been introduced with the following public statement: “Results attest to
the success of our reform measures. The hard work and dedication of our principals,
teachers, support staff and students resulting in these strong gains is being celebrated
today” (School District of Philadelphia, 2004). Similarly, the Pennsylvania State
Secretary of Education made it clear that it was the low performance on the PSSA that
was of greatest concern to him: “We had a situation where more than 150 schools had
over 50 percent of their students performing at the below basic level on the PSSA’s. We
believe that there was not the capacity on the ground to turn that situation around. We
needed outside expertise” (Christman, Gold and Herold, 2006, p. 8). Federal
expectations, as defined by NCLB, were much the same. As Jolley Bruce Christman and 
her colleagues (2006, p. 13) put it: 

Looming NCLB sanctions strongly influenced the academic programs across all 
providers; providers had to align their curricula with the all-important 
Pennsylvania State System of Assessment (PSSA) exams, the state assessment 
used to measure schools’ progress toward NCLB’s Adequate Yearly Progress 
(AYP) targets. 

In other words, expectations for managers were clear: raise test score
performance on the PSSA, especially the performance of those who were performing
below basic levels of proficiency.

Measuring Student Performance 

Despite the widespread agreement on the need to boost student performance on 
the PSSA, Pennsylvania’s system of measuring school performance was, in 2002, still a 
work in progress. Although the PSSA was aligned to Pennsylvania’s curricular 
standards, only students in 5th, 8th and 11th grades were being tested in reading and math.
Since students in only three grades, separated by three years, were being given the PSSA,
neither school administrators nor the public at large could track the annual progress of a
particular cohort of students from one year to the next. Not until the spring of 2005 did
schools begin testing students on the PSSA in grade 3, and not until 2006 did the third-grade
exam become a part of the state accountability system. Grades 4, 6 and 7 were not
tested until 2006. Thus, within elementary and middle schools, only students in 5th and
8th grades were administered a high-stakes PSSA test before 2006. For students in these
grades, however, the heavy emphasis that the federal government, the State of
Pennsylvania, and the SRC all were placing on the PSSA made it a test for which the
stakes were high.

The PSSA differed from two other tests given to some Philadelphia students 
between 2001 and 2006. In 2001 and 2002, students in grades 3, 4 and 7 were tested on 
the Stanford 9; between 2003 and 2005, the Terra Nova, designed by a different 
company, was used at most grade levels. The Stanford 9 and Terra Nova are nationally 
normed tests that have no particular alignment with Pennsylvania state standards. Nor 
were schools being held accountable for performance on these tests either by the state or
under federal law. Thus, they are best understood as “low-stakes” tests not aligned with 
the Pennsylvania curricular standards. 

Some students, in some years, were given both the high-stakes and a low-stakes 
test. In 2006, the low-stakes Terra Nova was dropped, except for students in first and
second grade. The PSSA was given to 3rd graders in 2005, but it did not become a
high-stakes test that was used for accountability purposes until 2006, the same year the PSSA
was introduced in grades 4, 6, and 7.

In all of this hodge-podge, one thing was quite clear: Performance on the PSSA is 
what mattered. It was student performance on this test that determined whether or not 
Philadelphia schools were making the federally required AYP. Administrators, teachers
and most students were well aware of this fact. Whether they took either of the other two 
tests seriously is unknown. For that reason, my evaluation focuses on student 
performance on the PSSA, the test that counts in Pennsylvania.

Baseline Student Performance at For-Profit, Non-Profit and District Schools

PSSA scores are available by school for 5th and 8th grades for the spring of 2002,
immediately before private managers assumed their responsibilities. Those scores reveal 
the seriousness of the challenge the new managers were about to face. 

Math performance on the PSSA was very low across the
Philadelphia school district (Table 1). Only 19 percent of 5th graders were performing at
the state-determined proficiency standard in the spring of 2002, while 59 percent were
scoring below a basic level of proficiency. But those numbers, as discouraging as they
were, exceeded by a substantial margin the level attained by 5th graders in the schools
SRC assigned to the for-profit managers. Only 6 percent of those students were
proficient, while 78 percent had scores below a basic level of proficiency. On the reading
portion of the PSSA, the proficiency level at the schools managed by the for-profits was
reached by just 9 percent of the students, as compared to 21 percent district-wide. The
percentage performing below the basic level was 68 percent, as compared to 52 percent
across the district. For 8th graders, similar disparities between the schools assigned to the
for-profit managers and district-wide performance were reported (Table 1).

The schools SRC assigned to non-profits also had exceptionally low-scoring 
students, though not ones as seriously disadvantaged as those assigned to the for-profits. 
Among 5th graders, only 12 percent were proficient on the PSSA math test, while 68 
percent were scoring below a basic level of proficiency. In reading, those percentages 
were 13 percent and 61 percent, respectively. At the schools assigned to the district-run
ORS for restructuring, the degree of educational disadvantage among 5th graders was
similar to that at the schools assigned to the non-profits (see Table 1).
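
The baseline gaps described above can be tabulated directly. The following is a minimal sketch in Python that uses only the 5th grade, spring 2002 percentages quoted in the text. The dictionary layout and the names `BASELINE_2002` and `gap_vs_district` are illustrative assumptions, not part of the original analysis, and ORS schools are left out because the text does not quote their figures.

```python
# Spring 2002 PSSA results for 5th graders, as quoted in the text (percent).
# Keys: (school group, subject) -> (percent proficient, percent below basic).
BASELINE_2002 = {
    ("district", "math"): (19, 59),
    ("district", "reading"): (21, 52),
    ("for-profit", "math"): (6, 78),
    ("for-profit", "reading"): (9, 68),
    ("non-profit", "math"): (12, 68),
    ("non-profit", "reading"): (13, 61),
}


def gap_vs_district(group: str, subject: str) -> tuple[int, int]:
    """Return (proficient gap, below-basic gap) in percentage points
    between a school group and the district as a whole."""
    prof, below = BASELINE_2002[(group, subject)]
    d_prof, d_below = BASELINE_2002[("district", subject)]
    return prof - d_prof, below - d_below


for group in ("for-profit", "non-profit"):
    for subject in ("math", "reading"):
        prof_gap, below_gap = gap_vs_district(group, subject)
        print(f"{group:10s} {subject:7s} "
              f"proficient {prof_gap:+d}, below basic {below_gap:+d}")
```

A negative proficient gap and a positive below-basic gap both indicate a group that started behind the district average.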

Cohort Growth Estimations of Student Performance

To assess the performance of private managers at these severely disadvantaged 
schools during the first four years after they assumed their responsibilities, I tracked the 
“growth” or progress of specific cohorts of students from one time period to another on 
the PSSA, as compared to the growth for those in the same cohort district-wide. I also
compared the growth at the privately-managed schools to the schools that were 
restructured by ORS, the office of restructuring operated by the school district itself. 

The approach resembles the one proposed by many scholars, states and school 
districts as a way of improving NCLB accountability provisions (Hoff, 2006; Lee, 2004;
Peterson and West, 2006). Currently, schools must show that the percentage of all 
students (and of relevant student subgroups) performing at proficient levels in math and 
reading exceeds statewide targets. These targets are scheduled to be raised at regular 
intervals until they reach 100 percent in 2014. That accountability method, many have 
pointed out, ignores the possibility that the demographic profile of the school may change as
each succeeding class enters the school.

A more appropriate methodology, many say, is to track the progress of individual 
students as they move through school, an approach that is often called a growth model. 
Since information for individual students is not publicly available for the Philadelphia 
school district, the best approximation (using publicly available data) is to track a cohort 
of students as it moves through school. For the period 2002 to 2006, it is possible to 
track the growth in performance on the PSSA of two cohorts of students over a three-year
period, because a high-stakes PSSA was administered both to 5th graders and 8th
graders. Cohort I attended 5th grade in 2002 and 8th grade in 2005. Cohort II attended 5th
grade in 2003 and 8th grade in 2006.
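
As a rough illustration of the cohort arithmetic described above, the Python sketch below pairs each cohort's 5th grade test year with its 8th grade test year three years later and computes growth as a simple difference in percentage points. The class and function names are hypothetical, and the 30 and 55 percent figures are placeholders chosen only to show the calculation; they are not results from this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cohort:
    """A group of students tested in 5th grade and again in 8th grade."""
    name: str
    grade5_year: int

    @property
    def grade8_year(self) -> int:
        # 8th grade testing falls three school years after 5th grade testing.
        return self.grade5_year + 3


def cohort_growth(pct_grade5: float, pct_grade8: float) -> float:
    """Change, in percentage points, in the share of a cohort performing
    at or above a given PSSA level between 5th and 8th grade."""
    return pct_grade8 - pct_grade5


cohorts = [Cohort("Cohort I", 2002), Cohort("Cohort II", 2003)]
for c in cohorts:
    print(f"{c.name}: 5th grade in {c.grade5_year}, 8th grade in {c.grade8_year}")

# Hypothetical placeholder values, not figures from the study:
# 30 percent at or above basic in 5th grade, 55 percent in 8th grade.
example_gain = cohort_growth(30.0, 55.0)
print(f"Example gain: {example_gain:.0f} percentage points")
```

Because only school-level percentages are public, a calculation of this kind compares the same cohort at two points in time rather than following individual students.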

Cohort Growth analysis was conducted for all elementary and middle schools that 
served the same cohort of students in both Grade 5 and Grade 8. For Cohort I, that was 16 
privately-managed schools and 4 ORS schools. For Cohort II, those numbers were 19 and 
8 for the two types of management, respectively. 4 For the district-wide comparison, all 
students in the cohort were included. 

The schools that were included in the Cohort Growth analyses were comparable 
to all those assigned to private managers and to ORS. The percentage eligible for school 
lunch was 88 percent at the schools included in the Cohort Growth analyses, as compared 
to 89 percent and 85 percent in all the schools managed by for-profit firms and non-profit 
entities, respectively. 5 The percentage of minority students (African-American and 
Hispanic) at the schools included in the Cohort Growth analyses was between 92 and 93 
percent, as compared to 95 percent at all schools managed by the for-profit firms and 97
percent at the schools managed by the non-profit firms (cf. Tables 1 and 2).

4 The number of schools differs for the two cohorts, because more schools had an 8th
grade class in 2006. A number of the privately-managed schools and all but one of the ORS schools
“grew” grades beyond 5th grade during the period under observation. It is assumed that
the students in those new grades came from within the school. However, a separate
analysis was performed on only those privately-managed schools that had 8th grade
classes throughout the period. Results did not change in any substantively
significant way.

5 The percentage receiving subsidized school lunch is for 2005-06, the year for which
information was available.

During the course of three years, some students move from one school to another.
For Cohort Growth analysis to yield valid results, it must be assumed that the movement
of students in and out of a particular school is not skewed in one direction or another.
The educational performance of the new students in that cohort arriving at a school over
the time period in question cannot differ systematically from those that leave. If a school 
has the reputation of being particularly disadvantaged, or if it has been the focal point of 
considerable conflict, the new students that arrive at the school may well be more 
disadvantaged than those leaving the school. This raises the concern that the Cohort 
Growth approach I am employing could under-estimate performance of students at the 
schools under private management. 6 But that seems unlikely, inasmuch as the racial and 
ethnic composition of the student body at these schools remained essentially unchanged 
throughout the period. For Cohort II, the percentage African American hovered between 
75 and 76 percent; the percentage Hispanic remained at 18 percent, the percentage Asian 
remained at 3 percent, while whites dipped only slightly from 4 to 3 percent of the 
student body over the course of the three years (for full details, see Table 2). In other 
words, the best evidence available suggests little change in the demographic composition 
of the student body at the schools included in the Cohort Growth analysis. 

6 The effect of private management is also under-estimated, if some of those in 8th grade
have been within the cohort (i.e., subject to treatment) for less than the full three years. It
is possible, however, that our findings were induced by an influx into the
privately-managed schools of more talented students of the same racial and ethnic background. If
that were to be the case, it would suggest that privately-managed schools had acquired
positive reputations in the community. The privately-managed schools seem to have done
a better job of retaining their students between the 5th and 8th grades. For Cohort II,
enrollment decline between 5th and 8th grade was 13.9 percent at those schools, compared
to 20.9 percent at the ORS schools and 18.5 percent district-wide. For Cohort I, the
enrollment at the privately-managed schools actually increased by 7.8 percent, as
compared to declines at the ORS schools of 23.8 percent and of 4.8 percent for the
School District of Philadelphia. It is not known whether these are school effects or due to
differential changes in neighborhoods, however.

Results from a Cohort Growth analysis of students moving from 5th to 8th grade
are particularly significant for two reasons. First, the tracking of student progress over
more than just a single year allows one to see whether or not continuing progress is being
made. Second, it is precisely the middle years of schooling where student progress has 
begun to falter nation-wide. On the National Assessment of Educational Progress, 4th
graders have been making steady gains for the past fifteen years, but those gains have not
translated into equivalent gains for 8th graders (Peterson, 2006, Figure 3, p. 55). If a
school is particularly effective between grades 5 and 8, that is a matter well worth 
knowing. 

The State of Pennsylvania provides information on the percentage of students
performing at 1) below basic levels of proficiency, 2) basic levels, 3) full proficiency
levels and 4) advanced levels. Given the extreme levels of disadvantage at the schools
operated by the private managers, two questions are particularly pertinent. First, are
those schools as effective as or more effective than district schools at lifting the percentage of students
performing to a basic level of proficiency, a major objective that was identified by the
State of Pennsylvania at the time the contracting-out policy was put into place? And,
second, are schools that are privately managed as effective as or more effective than district schools at
raising the percentage of students performing at or above full proficiency levels?

The results, as reported in Table 3, indicate that private managers are more 
effective at increasing the percentage of students performing at basic levels of proficiency 
than either schools restructured by the district itself (ORS schools) or the entire School 
District of Philadelphia. They are just as effective as district schools at
raising the percentage of students performing at proficient levels. 

Let us first examine the change in the percentage of students performing at basic
levels of proficiency. In math, for Cohort I, the percentage performing at a basic level or
better jumped by 30 percentage points at the schools managed by the private providers, as
compared to just 21 percentage points district-wide and just 16 percentage points at the
ORS schools. For Cohort II, the rise in basic math performance for students at the
privately-managed schools was 23 percentage points, as compared to 15 percentage
points district-wide and 12 percentage points in the ORS schools.

In reading, the percentage in Cohort I performing at or above the basic level rose
by 16 percentage points at the privately-managed schools, as compared to 13 percentage
points district-wide. For this cohort, ORS schools out-performed the other two groups;
the jump in reading scores at those schools was 21 percentage points. For Cohort II the
percentage performing at the basic level or better at the privately-managed schools
climbed by 25 percentage points, as compared to only 17 percentage points for the district as a
whole and 15 percentage points at the ORS schools. In other words, in all comparisons the schools
managed by the private providers outperformed the overall district trend, and in all but 
one comparison they outperformed the schools restructured by ORS. 
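The comparisons in the preceding two paragraphs can be tabulated in a few lines. The sketch below simply re-enters the percentage-point gains quoted in the text (which the text attributes to Table 3) and computes the margin by which the privately-managed schools exceeded the district and the ORS schools. The dictionary layout and the function name are illustrative choices, not part of the original analysis.

```python
# Percentage-point gains in students at or above the basic level, grade 5 to
# grade 8, re-entered from the figures quoted in the text (attributed to Table 3).
basic_gains = {
    ("math", "Cohort I"): {"private": 30, "district": 21, "ors": 16},
    ("math", "Cohort II"): {"private": 23, "district": 15, "ors": 12},
    ("reading", "Cohort I"): {"private": 16, "district": 13, "ors": 21},
    ("reading", "Cohort II"): {"private": 25, "district": 17, "ors": 15},
}


def private_margins(gains):
    """Return private-minus-district and private-minus-ORS margins per comparison."""
    return {
        key: {
            "vs_district": row["private"] - row["district"],
            "vs_ors": row["private"] - row["ors"],
        }
        for key, row in gains.items()
    }


margins = private_margins(basic_gains)
for (subject, cohort), m in margins.items():
    print(f"{subject:8s} {cohort:10s} vs district {m['vs_district']:+d}  vs ORS {m['vs_ors']:+d}")
```

A positive margin means the privately-managed schools gained more; the single negative entry is the Cohort I reading comparison with ORS noted above.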

Turning to the percentage of students performing at full proficiency levels, one 
finds that the relative math performance of the students at schools managed by for-profit 
providers was roughly the same as the district average and somewhat better than at the 
ORS schools. For Cohort I, the percentage performing proficiently in math at the 
privately-managed schools improved from grade 5 to grade 8 by 22 points, as compared 
to 21 points district-wide and 15 points at the ORS schools. For Cohort II, those
gains were 15, 14 and 10 percentage points, respectively.

In reading, the percentage of Cohort I at the schools under private management
performing at the proficient level increased between 5th and 8th grade by 18 percentage
points, as compared to 19 percentage points for the district as a whole and 20 percentage points at
the ORS schools. For Cohort II, the gains were 21, 21, and 19 percentage points,
respectively.

In sum, private providers, as compared to both the ORS schools and to schools in 
the district as a whole, lifted a comparable percentage of students to a full proficiency level
in reading and math, while doing a more effective job at raising the percentage of 
students to a basic level of proficiency in the two subjects. In interpreting these results, it 
must be kept in mind that students at the for-profit schools were especially 
disadvantaged, and they had to improve by a larger margin in order to reach proficiency 
levels. 

The RAND-RFA Analysis of Private Provider Impacts 

My results for students in their middle years differ from those reported by RAND-RFA.
The authors find no significant difference between the performance of schools
administered by private providers and the district-wide performance of students in
Philadelphia. It is worth taking a careful look at their methodology, as it is presented as a 
quasi-experiment, something that goes well beyond the Cohort Growth analysis we have 
reported. 

Not a Quasi-Experiment. 

RAND-RFA says they have the data on individual students that allow them to 
conduct a 

quasi-experimental analysis that examines the relative achievement of 
students in “treated” schools before and after the management change, comparing 
their trends with the trends of other students in Philadelphia (p. 24). 




Their model for estimating the impact of the privately managed schools is described as 
follows: 

the achievement of students in schools . . . before and after the state 
takeover, and compares their trends with the trends of other students in 
Philadelphia. This fixed-effects approach allows each student to serve as his or 
her own control, thereby factoring out characteristics of students (such as race, 
ethnicity and other unchanging family and student characteristics) that may affect 
student achievement results (p. xii).

Even better, each school serves “as its own control, which is particularly important 
because schools were selected for treatment because they had a history of low 
achievement.” 

A number of scholars have conducted quasi-experiments using fixed-effects 
analysis to estimate school impacts on student achievement (see, e. g., Hanushek, Kain 
and Rivkin, 2002). The method is straightforward: One obtains three or more test
scores from the same student, two or more before they enter “treatment” (in this case,
before their school is brought under private management) as well as at least one
post-treatment score that will allow for an estimation of gain from before to after “treatment.”
The gains made from pre-treatment are then compared to the gains made post-treatment 
to see whether they differ in any way. Similar information is obtained from a comparison 
group of students who have not been “treated” (in this case, those who did not go to a 
privately-managed school). The results for the treated and untreated groups of students 
are then compared. Had RAND-RFA conducted such a quasi-experiment, its results 
would have considerable credibility. 
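The logic of such a gain-based comparison can be sketched briefly. The example below is a bare-bones illustration under stated assumptions, not RAND-RFA's model or that of Hanushek, Kain and Rivkin: each student is assumed to have exactly two pre-treatment scores and one post-treatment score, the scores are invented, and the function names are hypothetical. A real fixed-effects analysis would also estimate standard errors and handle students with differing numbers of observations.

```python
from statistics import mean


def gain_difference(students):
    """Mean of (post-treatment gain minus pre-treatment gain) across students.

    Each student is a tuple of three scores: (pre1, pre2, post).
    """
    return mean((post - pre2) - (pre2 - pre1) for pre1, pre2, post in students)


def quasi_experimental_estimate(treated, comparison):
    """Change in gains for treated students minus the change for comparison students."""
    return gain_difference(treated) - gain_difference(comparison)


# Hypothetical scale scores, invented purely for illustration.
treated = [(40, 44, 52), (35, 38, 47), (50, 53, 60)]
comparison = [(45, 49, 54), (55, 58, 62), (60, 64, 68)]

estimate = quasi_experimental_estimate(treated, comparison)
print(f"Estimated change in gains attributable to treatment: {estimate:.2f}")
```

Because each student's earlier gain is subtracted from his or her later gain, anything that affects a student's growth at a constant rate drops out. That step is impossible when only one pre-treatment score is available, since no pre-treatment gain can then be computed.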

But RAND-RFA did not conduct a quasi-experiment. Instead of comparing gains 
during the pre-treatment with gains post-treatment, the researchers simply compared 
levels of achievement. They defend this approach on the grounds that they would lose as
many as half the students from their study, because they could only obtain two, not three, 
test scores in the same school for these students. Also, for an unknown number of 
students the two test scores were collected only after treatment had begun. 7 As the 
report’s senior author, Brian Gill, explained in an email to me: 

The analysis regards each year as a separate treatment (with 
effects assumed to be additive across years), so students with two 
achievement observations are included even if they were not present prior to 
the management change. This is why we don't have a dramatic decline in 
sample size as we move from year 1 to year 4. 

Inclusion of the unknown number of students observed after treatment had begun
contributes nothing to the estimate of the impact of private management organizations,
however, so the number of observations used to estimate the impact of private
management is much smaller than the number included in the estimation. 8 Even for those
students whose test scores are observed both pre- and post-treatment, the study is not a 
quasi-experiment, as only levels of achievement are observed prior to treatment. 

7 RAND-RFA does not report the number of students included in its analysis for whom
it has only post-treatment observations, but it does say that for as many as half the
students it has only two observations (p. 30), so almost certainly a majority of the
students in the third and fourth years' analyses were not observed at the school prior to
treatment. Thus, an unknown number of observations contribute nothing to the estimate
of treatment.

8 I am assuming that RAND-RFA captured both student and school fixed effects in its
analysis. If both student and school fixed effects were not captured for those students
observed only in the post-treatment status, then the need for adjustments for additional
background characteristics is all the more compelling.

When data are limited to the extent that they are in this situation, the standard way
of estimating school impacts is to regress achievement levels on the treatment variable of
interest along with a variety of control variables, including the original test score plus a
wide variety of other student characteristics (race, ethnicity, gender, eligibility for
subsidized lunch, Limited English Proficiency, eligibility for special education, and
so forth) as well as the peer-group composition at that school. (For classic examples
and discussions, see Coleman and Hoffer, 1987; Hoffer, Greeley and Coleman, 1985;
Goldberger and Cain, 1982; Jencks, 1985.) Unless the number of control variables is
adequate, most scholars doubt the validity of the results. 9
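The conventional regression approach just described can be sketched as follows. This is a minimal illustration only, neither RAND-RFA's model nor my own estimation: the eight student records, the two control variables, and the rule used to generate the outcomes are all hypothetical, chosen so that the true treatment effect (3 points) is known in advance. The small `ols` helper solves the normal equations directly so that the example needs no outside libraries.

```python
def ols(rows, outcomes):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: one tuple of regressors per student, with a leading 1 for the intercept.
    outcomes: one achievement score per student. Returns the coefficient list.
    """
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * y for r, y in zip(rows, outcomes))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Hypothetical students: (intercept, treated, prior_score, subsidized_lunch).
students = [
    (1, 1, 40, 1), (1, 1, 50, 0), (1, 1, 35, 1), (1, 1, 48, 0),
    (1, 0, 45, 1), (1, 0, 60, 0), (1, 0, 55, 0), (1, 0, 42, 1),
]
# Outcomes generated from an assumed rule so the true treatment effect is known:
# score = 10 + 3*treated + 0.8*prior_score - 2*subsidized_lunch.
scores = [10 + 3 * t + 0.8 * p - 2 * l for _, t, p, l in students]

coefficients = ols(students, scores)
treatment_effect = coefficients[1]

# Unadjusted comparison of mean scores, ignoring the control variables.
treated_scores = [s for row, s in zip(students, scores) if row[1] == 1]
untreated_scores = [s for row, s in zip(students, scores) if row[1] == 0]
naive_difference = (sum(treated_scores) / len(treated_scores)
                    - sum(untreated_scores) / len(untreated_scores))

print(f"Adjusted treatment coefficient: {treatment_effect:.2f}")
print(f"Unadjusted difference in means: {naive_difference:.2f}")
```

In this invented data set the treated students start from lower prior scores, so the unadjusted comparison makes treatment look harmful even though the built-in effect is positive. That is the kind of downward bias at issue when background controls are omitted for schools serving the most disadvantaged students.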

That standard methodology was eschewed by RAND-RFA in favor of its own, 
distinctive analysis that allegedly compares each student to himself or herself and each 
school to itself. But one can estimate treatment effects using a fixed effects analysis only 
if one compares gains by students at the same school both before and after treatment. In 
other words, RAND-RFA did not adjust for the many background variables and
peer-group composition factors that affect student performance. As a result, their study is
seriously at risk of having under-estimated the impact of the privately-managed schools, 
which were assigned the most disadvantaged schools in Philadelphia. 

High-stakes, curriculum-aligned tests vis-a-vis low-stakes, norm-referenced tests. 

Quite apart from its analytical strategy, RAND-RFA included in its analysis
information on students for whom at least two test scores were available on any test the
Philadelphia school district had administered over the period 2002 to 2006. As
mentioned previously, these tests included the PSSA, the Terra Nova, and the Stanford 9.
A student might have taken one Stanford 9 and one Terra Nova test, or one Stanford 9
and one PSSA, or any other combination. The analysts assumed that the standard score
obtained from student performances on any one of the three tests was the same as the
score the students would have received, had they taken another one. That is most unlikely since the
PSSA is a high-stakes, criterion-referenced test aligned to the Pennsylvania state
curriculum, and principals and teachers have strong incentives to make sure their students
do well on the PSSA. The Stanford 9 and the Terra Nova, by contrast, are low-stakes,
nationally-normed tests not aligned to Pennsylvania standards. There are no clear
consequences for schools that do not perform well on these tests. Whether principals and
teachers take them seriously is a matter of speculation.

9 An improvement on a traditional OLS regression with background controls would be to
estimate the propensity of an individual to be in the treatment group and then use other
information in the data set to estimate values for missing observations.

That the two types of tests do not yield equivalent results is evident from 
information provided in a note to the RAND-RFA report. Some students took both the 
PSSA and the Terra Nova in the spring of the same school year. For these students, 
RAND-RFA (p. 25, note 3) provide information that allows one to calculate the variance 
in student performance on one test that can be explained by the variance in student 
performance on the other. The amount explained was just 50 percent on the reading test 
and 45 percent on the math exam. In other words, only half (or less) of the variation in 
student performance on one exam can be explained by performance on the other, despite 
the fact that the two tests were taken at approximately the same time. 
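To put those shares in more familiar terms, the short calculation below converts them into implied correlations. It assumes a simple two-variable linear relationship, in which the share of variance explained equals the square of the correlation; the text does not say in what form RAND-RFA's note reports the underlying figures, so this is an illustration of the arithmetic only.

```python
import math

# Shares of variance explained, as stated in the text above.
variance_explained = {"reading": 0.50, "math": 0.45}


def implied_correlation(r_squared):
    """Magnitude of the correlation implied by a two-variable R-squared."""
    return math.sqrt(r_squared)


for subject, r2 in variance_explained.items():
    r = implied_correlation(r2)
    print(f"{subject}: implied correlation {r:.2f}, unexplained share {1 - r2:.0%}")
```

Under that assumption the implied correlations are roughly 0.71 for reading and 0.67 for math, which leaves half or more of the variation on one test unaccounted for by the other.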

Treating high- and low-stakes tests as equivalent biases downward the estimate of 
the impact of privately-managed schools, if students and teachers at disadvantaged 
schools are particularly likely not to take low-stakes tests seriously. 10 

10 If the errors resulting from the equating of high-stakes and low-stakes tests are
uncorrelated with treatment status, this assumption would not introduce bias, only noise.
Noise reduces the likelihood of identifying impacts, but the more serious concern is the
possibility that students and teachers at disadvantaged schools were especially unlikely to
attend to tests that had low stakes.
In sum, the RAND-RFA analysis did not adequately adjust for student
background and peer-group characteristics, and their analysis mixes results from two very
different types of tests, rather than looking separately at the impact of private
management on high-stakes and low-stakes tests. As a result, it is at risk of
under-estimating the impact of private managers, who were responsible for schools serving the
educationally most disadvantaged students. 

Comparing Philadelphia School District Performance to Schools Elsewhere

Although my Cohort Growth estimation indicates that privately-managed schools
were, at least for students in their middle years, more effective than the Philadelphia
school district as a whole, that does not mean that the Philadelphia district was doing
badly. As reported earlier, gains in student performance between 5th and 8th grade were
sizeable for schools across the Philadelphia school district (Table 3).

Those gains in test-score performance are disparaged in the RAND-RFA report, 
which says that: 

. . . after four years, the gains of its low-achieving schools (constituting 
most of the schools in the district) have generally not exceeded the gains of low- 
achieving schools elsewhere in Pennsylvania. 

In other words, as Pennsylvania goes, so goes Philadelphia, and nothing special is 
happening in that particular locale. 

Chapter 3 of the RAND report is devoted to supporting this claim. In that chapter
RAND says that it wants to discover whether “bringing in private providers to manage
some of Philadelphia’s lowest-achieving schools . . . created additional pressure for
improvement in district-managed schools” (p. 17). If overall trends in the district are
positive, then competition from the private providers might be given partial credit for
spurring that achievement. But if overall trends in Philadelphia are nothing special, then
any potential district-wide impact from competition can be ignored.

When a student took two tests in the same spring, RAND-RFA chose to estimate
school impacts based on the low-stakes test rather than the PSSA, despite the fact that the
latter is aligned to Pennsylvania state standards.

To see whether or not Philadelphia was making significant strides forward, 
RAND-RFA identified all Pennsylvania schools that had performed in the lowest quartile 
on the PSSA, and then compared gains in PSSA test scores of those schools within
Philadelphia to those elsewhere in the state. After making this comparison,
RAND concludes that after four years no special gains in Philadelphia had been made. 

The bottom quartile, however, includes a broad range of schools, and one cannot 
be certain whether the schools outside Philadelphia that fall within that bracket are 
similar to those inside Philadelphia. RAND-RFA did not report basic information as to 
the comparability of the two groups. No information was provided as to whether the 
comparisons are being made across two groups of schools that have similar ethnicity, 
Limited English Proficiency, eligibility for the subsidized lunch program, eligibility for 
special education, and other background characteristics. However, one of the authors, 
Brian Gill, generously provided me by email a breakdown of the racial composition of 
the bottom-quartile schools inside and outside of Philadelphia. Among 5th graders, those
attending such schools inside Philadelphia were 72 percent African American, as
compared to 47 percent attending such schools elsewhere in the state. Among 8th
graders, the percentages were 69 percent and 47 percent respectively. Those are very
large differences between a test and comparison group of schools. 11

But perhaps not too much should be made of the claims in Chapter 3 of the 
RAND report. Within a month of its release, two of its authors denied they had 
questioned the success of the Philadelphia reforms. In a letter to the editor of the Wall 
Street Journal explaining their research, Brian Gill and Jolley Bruce Christman (2007, p. 
A5) said: “Our study found that achievement in Philadelphia has indeed risen 
dramatically in both privately managed schools and district-managed schools. The 
district's leadership deserves credit for this.” The point of chapter 3, however, was that 
SRC deserved no special credit for the trends in its test scores. If the authors now think
credit for progress is to be given to SRC, can RAND-RFA be sure that the competition 
generated by the private managers was not part of what made for success? 

Resources Available 

Ideally, one would like to assess educational benefits against their costs. How
much is the cost per pupil at a school? Do students learn more, if they attend schools that 
are better funded? Do increments in funding generate higher levels of learning? 

As desirable as it would be to conduct a cost-benefit analysis of the reforms in 
Philadelphia, that cannot be attempted with the data that the Philadelphia school district 
makes available to the public. Almost nothing is known about the expenditures per pupil 
at any school in Philadelphia, including those operated by the for-profit and non-profit 
managers. Since Philadelphia does not break out expenditures by school, nothing
conclusive, or even indicative, can be said about the financial playing field in
Philadelphia. Per-pupil expenditures may be perfectly identical at all schools, or there
may be gross disparities. Nor do we have information on the condition of the school
buildings throughout the district, though a national survey by the U.S. Department of
Education (2000) reports that schools serving the more disadvantaged tend to be in
poorer condition.

11 In an email, Brian Gill says that results change modestly but not significantly when a
closer racial match is achieved.

Research on other cities also provides strong evidence that schools serving 
disadvantaged students operate with lower expenditures per pupil than other schools 
within the same district (Roza and Hill, 2004; Roza and Hill, 2006). One of the main 
causes of the disparity seems to be the transfer preferences given to more senior teachers, 
who can move to another school in the district if they are the most senior applicant for the 
position available. Since teachers, as they acquire seniority, tend to transfer to schools 
with more advantaged students (Hanushek, Kain and Rivkin, 2004), and since teacher 
salaries are a function of seniority, schools with more advantaged students tend to receive 
more funds from the district office (to pay the more experienced teachers). Schools that 
have low-performing students are generally not given extra resources to compensate for 
their less experienced staff. 

In Philadelphia, a teacher’s transfer rights are a function of his or her seniority, 
and teachers’ salaries are higher, if they are more experienced. So the same conditions 
that generate inequalities in other cities are also present in Philadelphia. The extent of the 
disparity is unknown, however. 

Aware of the salary differentials among teachers at advantaged and disadvantaged 
schools, the private managers, at the time contracts were signed, negotiated funds to 
compensate them for the difference between teacher salaries at their schools as compared 
to the district average. They also were compensated for the fact that they would be
performing certain services previously provided by the school district. But the amount 
negotiated appears not to have taken into account the departure of an unusual number of 
experienced teachers from the schools managed by the for-profit firms. As part of its 
efforts to palliate teacher-union opposition, the school district allowed any teacher (not 
just the more senior teachers) to leave a privately-managed school. A high percentage of 
teachers took advantage of the opportunity in 2002, perhaps because teachers were now 
given the opportunity to transfer from a more disadvantaged to a less disadvantaged 
school. In any case, teacher transfer rates at schools run by the two for-profit firms increased
dramatically the year that management responsibilities were shifted. At Edison-run
schools, the transfer rate increased from 19 percent to 40 percent; at Victory, from 17 to 
40 percent. The school managed by Universal, a non-profit entity, experienced a change 
in its transfer rate from 14 to 36 percent (Neild, Useem and Farley, 2005; Neild and 
Spiridakis, 2003). As a result, the average number of years of experience at these schools 
dropped precipitously the year the managers assumed responsibility for the schools. 

Instead of providing additional resources to ameliorate the problems generated by 
the flight of experienced teachers, the state reduced the compensatory payments to the 
private providers (RAND-RFA, Table 2.1, p. 7). Meanwhile, the district received from
the state an increase in per pupil spending of around $1,900 between 2002 and 2005. It is
unknown how much of that increase was allocated to the schools under
for-profit or non-profit management.

In sum, one has no way of knowing whether per pupil expenditures at the 
privately managed schools were higher or lower than the district average. Nor can it be 
known whether the change in expenditures at these schools after 2001-02, relative to the
district average, was positive or negative. 

It is thus surprising that the RAND-RFA study attempted a cost-benefit analysis 
of the reform. The study asserts that “we find no evidence of differential academic
benefits that would support the additional expenditures on private managers” (p. xiv). In
a press release accompanying the study, one of the authors, Jolley Christman, invoking
the language of cost-benefit analysis, said quite explicitly: “investment in private
management of schools has not paid the expected dividends” (RAND, 2007). Yet the
study provides no information whatsoever on per pupil expenditures at the
privately-managed schools or at any other school either before or after the reform.

In a brief essay written for the Wall Street Journal, I observed that the RAND-RFA
study lacked “vital information on school spending,” which made a cost-benefit
analysis impossible (Peterson, 2007). In their reply to this essay, Gill and Christman 
(2007) made no attempt to defend their fiscal analysis (RAND, 2007), a surprising 
quiescence in light of the attention the media have given this aspect of their report. 

Reconciling Studies Reaching Disparate Results

Using publicly available data to estimate school effectiveness by means of a
Growth Model, I found that two separate cohorts of students at privately-managed
schools improved on the PSSA by a larger amount between grades 5 and 8 than did the
cohorts of students at ORS schools or the same cohort of students district-wide. That
students during the middle years of learning, a difficult time in many students’ lives, were 
making disproportionate learning gains at privately managed schools is a positive sign. 
My results differ from those reported by RAND-RFA, which did not find
disproportionate gains by students at schools managed by private providers. 

When two studies report dissimilar results about the same innovation, readers may 
wish to consider the restrictiveness of the assumptions underlying each analysis. My 
approach has some advantages and certain limitations. Its strength lies in its focus on 
trends in student performance on the PSSA, the high-stakes, curriculum-aligned test used 
to assess school performance in the State of Pennsylvania. It can be expected that 
principals, teachers and students took seriously student performance on this test. 
Principals can be expected to have made certain that teachers explained carefully to 
students the directions for taking the test, increasing the validity of the answers received. 
The credibility of the study is also strengthened by the fact that it relies only upon 
publicly available data, so others can immediately detect any errors I may have 
committed. 

On the other hand, I was able to track only some students in certain grades over
specific years. And I could only track students in 5th to 8th grades at schools that
served students at both those grade levels. Also, I could only track what was happening
at the school level, not the performances of individual students over time. Nor could I
control for background characteristics that might have affected student performance. If
more resourceful parents were removing their children from the very disadvantaged
schools assigned to the private managers, my findings may under-estimate the impact of 
the private providers. However, the best available evidence suggests that the background 
characteristics of the student body did not change significantly, inasmuch as its racial and 
ethnic composition remained essentially unchanged. 
To doubt the results I report, one must assume that the schools managed by the
private providers were attracting a more advantaged constituency or that the results for 
the cohorts passing through Grades 5 through 8 cannot be generalized to younger 
students. 12 

It is also possible that results on a high-stakes test, such as the PSSA, could have 
been skewed by cheating on the part of students or teachers. Such cheating could affect
both my results and those reported in the RAND-RFA study, which relies upon a
mixture of PSSA and low-stakes test results. To the best of my knowledge, no such
charges have been made in Philadelphia. 

The main advantage of the RAND-RFA study is that it draws upon data available
for individual students. In estimating effects of private management, however, it does not
conduct the quasi-experiment it purports to have undertaken but instead includes an
unknown number of students within its study who were observed only post-treatment.
Even for those observed both before and after treatment, it observes only pre-treatment
levels, not changes in performance, as it has only two observations for many students, not
the three that are required to conduct a quasi-experiment. Moreover, RAND-RFA
combines results from very different tests, implicitly assuming the high-stakes PSSA and
the low-stakes Stanford 9 and Terra Nova are essentially the same test. 13 Yet only half
the variance (or less) in student performance on one test can be explained by that same
student’s performance on the other test given at roughly the same time of the year. As a
result, the study is at risk of under-estimating private management impacts, because it did
not control adequately for peer-group composition or family background. Finally,
RAND-RFA relies upon confidential data that have not been released to other scholars, so
its analysis can neither be verified by others nor checked to see how sensitive findings
are to the specifics of the methodology employed.

12 The gains at the privately-managed schools could be attributed to mean-reversion
effects, had 2002 scores been particularly low. However, RAND-RFA (p. 31) shows that
the changes in test scores between 2001 and 2002 in privately-managed schools did not
differ from changes district-wide.

13 More exactly, it is assumed that any errors are uncorrelated with the intervention.

There is a third possibility: both studies could be correct for the populations they
are studying. RAND-RFA’s analysis might be heavily weighted by observations of
students in grades 3 through 5, while my observations are limited to students in grades 6
through 8. 14 If findings from both studies are correct despite the limitations of each, then
one would conclude that private management has advantages for middle school students 
but no impact on younger ones, one way or another. 

For results from the two studies to be reconciled, however, the School District of 
Philadelphia would need to authorize RAND to make available to other scholars the data 
that they have analyzed, so secondary analyses can ascertain whether the original 
findings are sensitive to the specifics of the RAND-RFA methodology. 

On other matters discussed in the RAND-RFA report, I have no new information. 
Neither I nor RAND has obtained information on expenditures per pupil for specific
schools in Philadelphia, privately managed or not. It is thus impossible to assess whether
or not Philadelphians were getting their money’s worth from the privately-managed
schools. Finally, equivalent schools from other parts of Pennsylvania were not included 
in that portion of the RAND-RFA analysis that attempts to assess the overall impact of 
school reform in Philadelphia. 

14 RAND-RFA provides no information on the distribution of students in their study by 
grade level. 